AITopics | mi dataset


mi dataset


Copycats

Neural Information Processing Systems | Feb-18-2026, 05:02:34 GMT

In the past, MI datasets were frequently proprietary, confined to particular institutions, and stored in private repositories. In this particular setting, there is a pressing need for alternative models of data sharing, documentation, and governance. Within this context, the emergence of Community-Contributed Platforms (CCPs) presented a potential for the public sharing of medical datasets.

artificial intelligence, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.05)
North America > Canada > Ontario > Toronto (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
(6 more...)

Genre: Research Report (0.46)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (0.69)
Health & Medicine > Therapeutic Area > Neurology (0.68)
Information Technology (0.68)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.94)
(3 more...)


Copycats: the many lives of a publicly available medical imaging dataset

Amelia Jiménez-Sánchez

Neural Information Processing Systems | Oct-10-2025, 16:54:07 GMT

Medical Imaging (MI) datasets are fundamental to artificial intelligence in healthcare.

dataset, documentation, mi dataset, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)
(9 more...)

Genre: Research Report > Experimental Study (0.67)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Nuclear Medicine (1.00)
(2 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)


Copycats: the many lives of a publicly available medical imaging dataset

Jiménez-Sánchez, Amelia, Avlona, Natalia-Rozalia, Juodelyte, Dovile, Sourget, Théo, Vang-Larsen, Caroline, Rogers, Anna, Zając, Hubert Dariusz, Cheplygina, Veronika

arXiv.org Artificial Intelligence | Sep-26-2025

Medical Imaging (MI) datasets are fundamental to artificial intelligence in healthcare. The accuracy, robustness, and fairness of diagnostic algorithms depend on the data (and its quality) used to train and evaluate the models. MI datasets used to be proprietary, but have become increasingly available to the public, including on community-contributed platforms (CCPs) like Kaggle or HuggingFace. While open data is important to enhance the redistribution of data's public value, we find that the current CCP governance model fails to uphold the quality needed and recommended practices for sharing, documenting, and evaluating datasets. In this paper, we conduct an analysis of publicly available machine learning datasets on CCPs, discussing datasets' context, and identifying limitations and gaps in the current CCP landscape. We highlight differences between MI and computer vision datasets, particularly in the potentially harmful downstream effects from poor adoption of recommended dataset management practices. We compare the analyzed datasets across several dimensions, including data sharing, data documentation, and maintenance. We find vague licenses, lack of persistent identifiers and storage, duplicates, and missing metadata, with differences between the platforms. Our research contributes to efforts in responsible data curation and AI algorithms for healthcare.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2402.06353

Country:

North America > United States (1.00)
Europe (0.67)

Genre: Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Nuclear Medicine (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Applied AI (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)


EEG-Based Mental Imagery Task Adaptation via Ensemble of Weight-Decomposed Low-Rank Adapters

Lotey, Taveena, Verma, Aman, Roy, Partha Pratim

arXiv.org Artificial Intelligence | Dec-8-2024

Electroencephalography (EEG) is widely researched for neural decoding in Brain Computer Interfaces (BCIs) as it is non-invasive, portable, and economical. However, EEG signals suffer from inter- and intra-subject variability, leading to poor performance. Recent technological advancements have led to deep learning (DL) models that have achieved high performance in various fields. However, such large models are compute- and resource-intensive and are a bottleneck for real-time neural decoding. Data distribution shift can be handled with domain adaptation techniques such as transfer learning (fine-tuning) and adversarial training, which require model parameter updates according to the target domain. One such recent technique is parameter-efficient fine-tuning (PEFT), which requires only a small fraction of the total trainable parameters compared to fine-tuning the whole model. Therefore, we explored PEFT methods for adapting EEG-based mental imagery tasks. We considered two mental imagery tasks, speech imagery and motor imagery, as both of these tasks are instrumental in post-stroke neuro-rehabilitation. We proposed a novel ensemble of weight-decomposed low-rank adaptation methods, EDoRA, for parameter-efficient mental imagery task adaptation through EEG signal classification. The performance of the proposed PEFT method is validated on two publicly available datasets, one a speech imagery dataset and the other a motor imagery dataset. In extensive experiments and analysis, the proposed method has performed better than full fine-tuning and state-of-the-art PEFT methods for mental imagery EEG classification.
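The abstract above does not spell out how a weight-decomposed low-rank adapter works, so the following is only a minimal sketch of the general idea as it is commonly described: a frozen weight W0 is updated by a low-rank product B A, and the result is rescaled column by column to a learnable magnitude. The function names, the toy matrices, and the zero initialisation of B are illustrative assumptions; the ensemble step of EDoRA is not shown because the abstract gives no details of it.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def column_norms(w):
    """Euclidean norm of each column of w."""
    return [math.sqrt(sum(row[j] ** 2 for row in w)) for j in range(len(w[0]))]

def dora_weight(w0, b, a, magnitude):
    """Recompose W' = magnitude * (W0 + B A) / ||W0 + B A||, column by column."""
    delta = matmul(b, a)
    directed = [[w0[i][j] + delta[i][j] for j in range(len(w0[0]))]
                for i in range(len(w0))]
    norms = column_norms(directed)
    return [[magnitude[j] * directed[i][j] / norms[j]
             for j in range(len(w0[0]))] for i in range(len(w0))]

# Toy frozen weight (2 x 3) and a rank-1 adapter; all values are illustrative.
w0 = [[3.0, 0.0, 1.0],
      [4.0, 2.0, 1.0]]
b = [[0.0], [0.0]]            # assumed: B starts at zero, so the adapter is a no-op
a = [[0.5, -0.5, 1.0]]        # A is 1 x 3
magnitude = column_norms(w0)  # assumed: magnitude starts at the column norms of W0

w_adapted = dora_weight(w0, b, a, magnitude)
```

Under these assumptions only `b`, `a`, and `magnitude` would be trained, which is where the parameter savings come from: for a rank-1 adapter on a 2 x 3 weight that is 2 + 3 + 3 values instead of all 6, and the gap widens quickly for larger matrices.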

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-031-78195-7_21

2412.17818

Country:

Asia > India > Uttarakhand > Roorkee (0.04)
North America > United States > Arizona (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.94)


MusiLingo: Bridging Music and Text with Pre-trained Language Models for Music Captioning and Query Response

Deng, Zihao, Ma, Yinghao, Liu, Yudong, Guo, Rongchen, Zhang, Ge, Chen, Wenhu, Huang, Wenhao, Benetos, Emmanouil

arXiv.org Artificial Intelligence | Oct-12-2023

Large Language Models (LLMs) have shown immense potential in multimodal applications, yet the convergence of textual and musical domains remains relatively unexplored. To address this gap, we present MusiLingo, a novel system for music caption generation and music-related query responses. MusiLingo employs a single projection layer to align music representations from the pre-trained frozen music audio model MERT with the frozen Vicuna-7B language model (an adaptation of LLaMA), bridging the gap between music audio and textual contexts. We train it on an extensive music caption dataset and fine-tune it with instructional data. Due to the scarcity of high-quality music Q&A datasets, we created the Music Instruct (MI) dataset from captions in the MusicCaps dataset, tailored for open-ended music inquiries. Empirical evaluations demonstrate its competitive performance in generating music captions and composing music-related Q&A pairs.
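As a rough illustration of what "a single projection layer" between two frozen models can mean, the sketch below maps each audio-frame embedding into the input space of a language model with one affine transform. The dimensions, the random initialisation, and the names `make_projection` and `project` are hypothetical and chosen for readability; they are not taken from MusiLingo, MERT, or Vicuna.

```python
import random

def make_projection(d_in, d_out, seed=0):
    """Create a small random weight matrix (d_out x d_in) and a zero bias."""
    rng = random.Random(seed)
    weight = [[rng.uniform(-0.1, 0.1) for _ in range(d_in)] for _ in range(d_out)]
    bias = [0.0] * d_out
    return weight, bias

def project(frames, weight, bias):
    """Map each d_in-dimensional frame embedding to a d_out-dimensional vector."""
    return [[sum(w * x for w, x in zip(row, frame)) + b
             for row, b in zip(weight, bias)] for frame in frames]

# Two toy audio-frame embeddings with d_in = 3, projected to d_out = 5.
music_frames = [[0.2, -0.1, 0.4], [0.0, 0.3, -0.2]]
weight, bias = make_projection(d_in=3, d_out=5)
llm_inputs = project(music_frames, weight, bias)
```

In a setup like the one described, the projection would be the only trainable component, while the audio encoder and the language model keep their pre-trained weights.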

dataset, language model, music, (13 more...)

arXiv.org Artificial Intelligence

2309.0873

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Pennsylvania (0.04)
North America > United States > Florida > Miami-Dade County > Miami (0.04)
(5 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)


Boosting Distress Support Dialogue Responses with Motivational Interviewing Strategy

Welivita, Anuradha, Pu, Pearl

arXiv.org Artificial Intelligence | May-17-2023

AI-driven chatbots have become an emerging solution to address psychological distress. Due to the lack of psychotherapeutic data, researchers use dialogues scraped from online peer support forums to train them. But since the responses on such platforms are not given by professionals, they contain both conforming and non-conforming responses. In this work, we attempt to recognize these conforming and non-conforming response types present in online distress-support dialogues using labels adapted from a well-established behavioral coding scheme named the Motivational Interviewing Treatment Integrity (MITI) code, and show how some response types could be rephrased into a more MI-adherent form that can, in turn, enable chatbot responses to be more compliant with the MI strategy. As a proof of concept, we build several rephrasers by fine-tuning Blender and GPT3 to rephrase MI non-adherent "Advise without permission" responses into "Advise with permission". We show how this can be achieved with the construction of pseudo-parallel corpora, avoiding costs for human labor. Through automatic and human evaluation, we show that when training data is limited, techniques such as prompting and data augmentation can be used to produce substantially good rephrasings that reflect the intended style and preserve the content of the original text.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2305.10195

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Portugal > Lisbon > Lisbon (0.04)
Asia > China > Hong Kong (0.04)
(6 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.93)


SMILe: Shuffled Multiple-Instance Learning

Doran, Gary (Case Western Reserve University) | Ray, Soumya (Case Western Reserve University)

AAAI Conferences | Jul-9-2013

Resampling techniques such as bagging are often used in supervised learning to produce more accurate classifiers. In this work, we show that multiple-instance learning admits a different form of resampling, which we call "shuffling." In shuffling, we resample instances in such a way that the resulting bags are likely to be correctly labeled. We show that resampling results in both a reduction of bag label noise and a propagation of additional informative constraints to a multiple-instance classifier. We empirically evaluate shuffling in the context of multiple-instance classification and multiple-instance active learning and show that the approach leads to significant improvements in accuracy.
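The abstract does not give the exact resampling rule, so the sketch below shows one plausible reading of shuffling: pool the instances from all positive bags and draw new bags from that pool without replacement. The function name, the `bag_size` parameter, and the toy instance labels are assumptions made for illustration and may differ from the authors' procedure.

```python
import random

def shuffle_positive_bags(positive_bags, n_new_bags, bag_size, seed=0):
    """Build new positive bags by sampling instances pooled from positive bags."""
    rng = random.Random(seed)
    pool = [inst for bag in positive_bags for inst in bag]
    if bag_size > len(pool):
        raise ValueError("bag_size cannot exceed the number of pooled instances")
    # Each new bag is drawn without replacement from the shared pool.
    return [rng.sample(pool, bag_size) for _ in range(n_new_bags)]

# Three toy positive bags; each is assumed to hold at least one positive instance.
positive_bags = [["a1", "a2", "a3"], ["b1", "b2"], ["c1", "c2", "c3"]]
new_bags = shuffle_positive_bags(positive_bags, n_new_bags=4, bag_size=5)
```

The intuition behind this reading is that every original positive bag contributes at least one positive instance to the pool, so a sufficiently large sample is likely to contain one as well, and the new bag can then be labeled positive with low risk of label noise.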

artificial intelligence, classifier, machine learning, (16 more...)

AAAI Conferences

Twenty-Seventh AAAI Conference on Artificial Intelligence

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Ohio > Cuyahoga County > Cleveland (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
